Basic Statistics

Raw Counts

Name                    Value
Rows                    9,060
Columns                 37
Discrete columns        30
Continuous columns      7
All missing columns     0
Missing observations    1,911
Complete Rows           7,173
Total observations      335,220
Memory allocation       8.4 Mb
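
The counts above are related arithmetically: total observations is rows times columns (9,060 × 37 = 335,220), and the discrete and continuous columns sum to the column total (30 + 7 = 37). The following is a minimal base R sketch of how such counts could be derived. The data frame `df` and its values are hypothetical stand-ins, not the data behind this report, and the report's own tooling may compute these figures differently.

```r
# Hypothetical toy data frame standing in for the real data set.
df <- data.frame(
  id = c(1, 2, 3, 4),
  label = c("A", "B", NA, "D"),
  score = c(7.1, NA, 6.4, 8.0),
  stringsAsFactors = FALSE
)

raw_counts <- list(
  rows = nrow(df),
  columns = ncol(df),
  # Columns in which every value is missing.
  all_missing_columns = sum(colSums(is.na(df)) == nrow(df)),
  # Individual missing cells across the whole table.
  missing_observations = sum(is.na(df)),
  # Rows with no missing value in any column.
  complete_rows = sum(complete.cases(df)),
  # Every cell counts as one observation.
  total_observations = nrow(df) * ncol(df)
)

str(raw_counts)
```

Note that missing observations are counted per cell while complete rows are counted per row, so the two figures do not have to add up to the row total.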

Percentages

Data Structure

Missing Data Profile

Univariate Distribution

Histogram

Bar Chart (with frequency)

## 8 columns ignored with more than 50 categories.
## title: 8773 categories
## original_title: 8833 categories
## overview: 9042 categories
## release_date: 5683 categories
## genre_ids: 2042 categories
## poster_path: 9059 categories
## backdrop_path: 9049 categories
## tagline: 7132 categories
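
The message above reflects a cardinality filter: discrete columns with more than 50 distinct values are skipped because a bar chart with thousands of bars is unreadable. A rough base R sketch of that kind of filter follows. The toy data, the `maxcat` variable name, and the choice to exclude missing values from the category count are assumptions made for illustration, not a description of how the report's plotting function works internally.

```r
# Hypothetical toy data; only the column names echo the report.
df <- data.frame(
  title = paste("Movie", 1:60),
  original_language = rep(c("en", "fr", "ja"), 20),
  stringsAsFactors = FALSE
)

maxcat <- 50  # threshold taken from the message above

# Treat character and factor columns as discrete.
is_discrete <- vapply(
  df, function(x) is.character(x) || is.factor(x), logical(1)
)

# Count distinct non-missing values per discrete column (an assumption:
# the real tool may or may not count NA as its own category).
n_categories <- vapply(
  df[is_discrete],
  function(x) length(unique(x[!is.na(x)])),
  integer(1)
)

ignored <- n_categories[n_categories > maxcat]
kept <- names(n_categories)[n_categories <= maxcat]

print(ignored)
print(kept)
```

Here `title` has 60 distinct values and would be ignored, while `original_language` has 3 and would be plotted.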

QQ Plot

Correlation Analysis

## 9 features with more than 20 categories ignored!
## title: 6952 categories
## original_title: 6981 categories
## overview: 7171 categories
## release_date: 4909 categories
## original_language: 33 categories
## genre_ids: 1786 categories
## poster_path: 7172 categories
## backdrop_path: 7172 categories
## tagline: 7127 categories
## Warning in cor(x = structure(list(id = c(19404, 283566, 278, 238, 424,
## 240, : the standard deviation is zero
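
The warning above appears when at least one column passed to `cor()` is constant: a correlation coefficient divides by each column's standard deviation, so a zero standard deviation leaves the result undefined. The sketch below reproduces the situation on hypothetical toy data and shows one possible remedy, dropping constant columns first. Whether that remedy suits the real data set is a judgment call this report does not settle.

```r
# Hypothetical toy data: `constant` never varies, like a flag that is
# the same in every row.
df <- data.frame(
  x = c(1, 2, 3, 4, 5),
  y = c(2, 4, 5, 4, 6),
  constant = c(0, 0, 0, 0, 0)
)

# cor() warns "the standard deviation is zero" here; the correlation
# between `constant` and the other columns comes back as NA.
m <- suppressWarnings(cor(df))

# Find columns with a positive standard deviation and keep only those.
sds <- vapply(df, sd, numeric(1))
clean <- df[, sds > 0, drop = FALSE]

# No warning and no NA once the constant column is gone.
m_clean <- cor(clean)
print(m_clean)
```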

Principal Component Analysis

## 8 features with more than 50 categories ignored!
## title: 6952 categories
## original_title: 6981 categories
## overview: 7171 categories
## release_date: 4909 categories
## genre_ids: 1786 categories
## poster_path: 7172 categories
## backdrop_path: 7172 categories
## tagline: 7127 categories
## Warning in plot_prcomp(data = structure(list(id = c(19404, 283566, 278, : The following features are dropped due to zero variance:
##  * adult_FALSE
##  * video_FALSE
##  * Documentary_FALSE